Wang Haihua
🚅 🚋😜 🚑 🚔
In this example, we'll be solving a scheduling problem. We have 2 offshore production plants in 2 locations and an estimated demand for our products.
We want to produce a schedule of production from both plants that meets our demand with the lowest cost.
A factory can be in 2 states:
Both factories have fixed costs, that are incurred as long as the factory is on, and variable costs, a cost per unit of production. These vary month by month.
We also know that factory B is down for maintenance in month 5.
We'll start by importing our data.
The data is imported into a multi-index pandas DataFrame using 'Month' and 'Factory' as our index columns.
import pandas as pd
import pulp
factories = pd.read_csv('data/factory_variables.csv', index_col=['Month', 'Factory'])
factories
Max_Capacity | Min_Capacity | Variable_Costs | Fixed_Costs | ||
---|---|---|---|---|---|
Month | Factory | ||||
1 | A | 100000 | 20000 | 10 | 500 |
B | 50000 | 20000 | 5 | 600 | |
2 | A | 110000 | 20000 | 11 | 500 |
B | 55000 | 20000 | 4 | 600 | |
3 | A | 120000 | 20000 | 12 | 500 |
B | 60000 | 20000 | 3 | 600 | |
4 | A | 145000 | 20000 | 9 | 500 |
B | 100000 | 20000 | 5 | 600 | |
5 | A | 160000 | 20000 | 8 | 500 |
B | 0 | 0 | 0 | 0 | |
6 | A | 140000 | 20000 | 8 | 500 |
B | 70000 | 20000 | 6 | 600 | |
7 | A | 155000 | 20000 | 5 | 500 |
B | 60000 | 20000 | 4 | 600 | |
8 | A | 200000 | 20000 | 7 | 500 |
B | 100000 | 20000 | 6 | 600 | |
9 | A | 210000 | 20000 | 9 | 500 |
B | 100000 | 20000 | 8 | 600 | |
10 | A | 197000 | 20000 | 10 | 500 |
B | 100000 | 20000 | 11 | 600 | |
11 | A | 80000 | 20000 | 8 | 500 |
B | 120000 | 20000 | 10 | 600 | |
12 | A | 150000 | 20000 | 8 | 500 |
B | 150000 | 20000 | 12 | 600 |
We'll also import our demand data
demand = pd.read_csv('data/monthly_demand.csv', index_col=['Month'])
demand.T
Month | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Demand | 120000 | 100000 | 130000 | 130000 | 140000 | 130000 | 150000 | 170000 | 200000 | 190000 | 140000 | 100000 |
As we have fixed costs and variable costs, we'll need to model both production and the status of the factory i.e. whether it is on or off.
Production is modelled as an integer variable.
We have a value for production for each month for each factory, this is given by the tuples of our multi-index pandas DataFrame index.
production = pulp.LpVariable.dicts("production",
((month, factory) for month, factory in factories.index),
lowBound=0,
cat='Integer')
Factory status is modelled as a binary variable. It will have a value of 1 if the factory is on and a value of 0 when the factory is off.
Binary variables are the same as integer variables but constrained to be >= 0 and <=1
Again this has a value for each month for each factory, again given by the index of our DataFrame
factory_status = pulp.LpVariable.dicts("factory_status",
((month, factory) for month, factory in factories.index),
cat='Binary')
We instantiate our model and use LpMinimize as the aim is to minimise costs.
model = pulp.LpProblem("Cost minimising scheduling problem", pulp.LpMinimize)
D:\software_install\anaconda\lib\site-packages\pulp\pulp.py:1199: UserWarning: Spaces are not permitted in the name. Converted to '_' warnings.warn("Spaces are not permitted in the name. Converted to '_'")
In our objective function we include our 2 costs:
model += pulp.lpSum(
[production[month, factory] * factories.loc[(month, factory), 'Variable_Costs'] for month, factory in factories.index]
+ [factory_status[month, factory] * factories.loc[(month, factory), 'Fixed_Costs'] for month, factory in factories.index]
)
We build up our constraints
# Production in any month must be equal to demand
months = demand.index
for month in months:
model += production[(month, 'A')] + production[(month, 'B')] == demand.loc[month, 'Demand']
An issue we run into here is that in linear programming we can't use conditional constraints.
For example we can't add to our model that if the factory is off factory status must be 0, and if it is on factory status must be 1. Before we've solved our model though, we don't know if the factory will be on or off in a given month.
In this case construct constraints that have minimum and maximum capacities that are constant variables, which we multiply by the factory status.
Now, either factory status is 0 and:
Or factory status is 1 and:
(In some cases we can use linear constraints to model conditional statements, we'll explore this in part 6)
# Production in any month must be between minimum and maximum capacity, or zero.
for month, factory in factories.index:
min_production = factories.loc[(month, factory), 'Min_Capacity']
max_production = factories.loc[(month, factory), 'Max_Capacity']
model += production[(month, factory)] >= min_production * factory_status[month, factory]
model += production[(month, factory)] <= max_production * factory_status[month, factory]
# Factory B is off in May
model += factory_status[5, 'B'] == 0
model += production[5, 'B'] == 0
We then solve the model
model.solve()
pulp.LpStatus[model.status]
'Optimal'
Let's take a look at the optimal production schedule output for each month from each factory. For ease of viewing we'll output the data to a pandas DataFrame.
output = []
for month, factory in production:
var_output = {
'Month': month,
'Factory': factory,
'Production': production[(month, factory)].varValue,
'Factory Status': factory_status[(month, factory)].varValue
}
output.append(var_output)
output_df = pd.DataFrame.from_records(output).sort_values(['Month', 'Factory'])
output_df.set_index(['Month', 'Factory'], inplace=True)
output_df
Production | Factory Status | ||
---|---|---|---|
Month | Factory | ||
1 | A | 70000.0 | 1.0 |
B | 50000.0 | 1.0 | |
2 | A | 45000.0 | 1.0 |
B | 55000.0 | 1.0 | |
3 | A | 70000.0 | 1.0 |
B | 60000.0 | 1.0 | |
4 | A | 30000.0 | 1.0 |
B | 100000.0 | 1.0 | |
5 | A | 140000.0 | 1.0 |
B | 0.0 | 0.0 | |
6 | A | 60000.0 | 1.0 |
B | 70000.0 | 1.0 | |
7 | A | 90000.0 | 1.0 |
B | 60000.0 | 1.0 | |
8 | A | 70000.0 | 1.0 |
B | 100000.0 | 1.0 | |
9 | A | 100000.0 | 1.0 |
B | 100000.0 | 1.0 | |
10 | A | 190000.0 | 1.0 |
B | 0.0 | 0.0 | |
11 | A | 80000.0 | 1.0 |
B | 60000.0 | 1.0 | |
12 | A | 100000.0 | 1.0 |
B | 0.0 | 0.0 |
Notice above that the factory status is 0 when not producing and 1 when it is producing
# Print our objective function value (Total Costs)
print(pulp.value(model.objective))
12906400.0